---
title: Doctran
---

>[Doctran](https://github.com/psychic-api/doctran) is a Python package that uses LLMs and open-source
> NLP libraries to transform raw text into clean, structured, information-dense documents
> that are optimized for vector space retrieval. You can think of `Doctran` as a black box where
> messy strings go in and nice, clean, labelled strings come out.


## Installation and Setup

<CodeGroup>
```bash pip
pip install doctran
```

```bash uv
uv add doctran
```
</CodeGroup>

## Document Transformers

### Document Interrogator

See a [usage example for DoctranQATransformer](/oss/integrations/document_transformers/doctran_interrogate_document).

```python
from langchain_community.document_transformers import DoctranQATransformer
```
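
As a rough sketch of how the interrogator might be called, the hypothetical helper `interrogate` below wraps one string in a `Document`, runs `DoctranQATransformer` over it, and reads the generated pairs from the `questions_and_answers` metadata key. It assumes `OPENAI_API_KEY` is set in the environment and that the installed version implements the synchronous `transform_documents`; treat the linked usage example as the reference.

```python
from typing import Any


def interrogate(text: str) -> Any:
    """Return the question/answer pairs Doctran generates for `text`.

    `interrogate` is a hypothetical helper written for illustration only.
    """
    # Imported lazily so this module still loads when the optional
    # dependencies are not installed.
    from langchain_core.documents import Document
    from langchain_community.document_transformers import DoctranQATransformer

    # Assumption: with no arguments, the key is read from OPENAI_API_KEY.
    qa_transformer = DoctranQATransformer()
    docs = qa_transformer.transform_documents([Document(page_content=text)])

    # The generated pairs are stored in each document's metadata.
    return docs[0].metadata["questions_and_answers"]
```

Calling `interrogate("...")` sends the text to an LLM, so each call needs network access and uses API credits.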

### Property Extractor

See a [usage example for DoctranPropertyExtractor](/oss/integrations/document_transformers/doctran_extract_properties).

```python
from langchain_community.document_transformers import DoctranPropertyExtractor
```
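
The extractor is driven by a list of property definitions. The sketch below shows one plausible schema and a hypothetical helper, `extract_properties`, that reads the result from the `extracted_properties` metadata key. The schema contents and the helper name are assumptions made for illustration. Depending on the installed version, the extractor may implement only the asynchronous `atransform_documents` rather than `transform_documents`, so check the linked usage example before relying on the call shown here.

```python
from typing import Any, Dict, List

# Hypothetical schema: each entry describes one property to extract.
EMAIL_PROPERTIES: List[Dict[str, Any]] = [
    {
        "name": "category",
        "description": "What type of email this is.",
        "type": "string",
        "enum": ["update", "action_item", "customer_feedback", "other"],
        "required": True,
    },
    {
        "name": "mentions",
        "description": "People mentioned in the email.",
        "type": "array",
        "items": {
            "name": "full_name",
            "description": "The full name of the person mentioned.",
            "type": "string",
        },
        "required": True,
    },
]


def extract_properties(text: str) -> Any:
    """Return the properties Doctran extracts from `text` (illustrative helper)."""
    # Imported lazily so the schema above can be used without the dependencies.
    from langchain_core.documents import Document
    from langchain_community.document_transformers import DoctranPropertyExtractor

    # Assumption: OPENAI_API_KEY is set in the environment.
    extractor = DoctranPropertyExtractor(properties=EMAIL_PROPERTIES)
    docs = extractor.transform_documents([Document(page_content=text)])

    # Extracted values are stored in each document's metadata.
    return docs[0].metadata["extracted_properties"]
```

Keeping the schema as plain data makes it easy to review or reuse across documents independently of the LLM call.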

### Document Translator

See a [usage example for DoctranTextTranslator](/oss/integrations/document_transformers/doctran_translate_document).

```python
from langchain_community.document_transformers import DoctranTextTranslator
```
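
A minimal sketch of a translation call, assuming `OPENAI_API_KEY` is set and that the installed version implements the synchronous `transform_documents`. The helper name `translate` and its default target language are illustrative choices, not part of the library.

```python
def translate(text: str, language: str = "spanish") -> str:
    """Return `text` translated into `language` (illustrative helper)."""
    # Imported lazily so this module still loads when the optional
    # dependencies are not installed.
    from langchain_core.documents import Document
    from langchain_community.document_transformers import DoctranTextTranslator

    # The target language is set once on the transformer.
    translator = DoctranTextTranslator(language=language)
    docs = translator.transform_documents([Document(page_content=text)])

    # The translated text is returned as the page content of the result.
    return docs[0].page_content
```

Because the language is fixed at construction time, translating into several languages means creating one transformer per language.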
